Task04 Кудрявцев Федор HSE #147

Open
wants to merge 1 commit into
base: task04

@koufesser koufesser commented Oct 5, 2024

Local output

Transpose

C:\Users\koufe\GPGPUTasks2024\cmake-build-debug\matrix_transpose.exe 1
OpenCL devices:
  Device #0: CPU. 13th Gen Intel(R) Core(TM) i7-13700H. Intel(R) Corporation. Total memory: 16003 Mb
  Device #1: GPU. Intel(R) Iris(R) Xe Graphics. Total memory: 6401 Mb
Using device #1: GPU. Intel(R) Iris(R) Xe Graphics. Total memory: 6401 Mb
Data generated for M=4096, K=4096
[matrix_transpose_naive]
    GPU: 0.00703333+-0.000515321 s
    GPU: 2385.39 millions/s
[matrix_transpose_local_bad_banks]
    GPU: 0.00731667+-0.000465176 s
    GPU: 2293.01 millions/s
[matrix_transpose_local_good_banks]
    GPU: 0.00723333+-0.000422953 s
    GPU: 2319.43 millions/s
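
The "millions/s" figures above appear to be the number of matrix elements (M*K) divided by the mean time. A minimal sketch of that arithmetic, assuming this is how the benchmark derives the value (the function name is hypothetical and not taken from the project):

```python
def transpose_throughput_millions(m: int, k: int, seconds: float) -> float:
    """Elements processed per second, in millions, for an m x k transpose."""
    return m * k / seconds / 1e6


# Mean time of [matrix_transpose_naive] from the local log above.
naive = transpose_throughput_millions(4096, 4096, 0.00703333)
print(f"{naive:.2f} millions/s")
```

With the three local timings this close to each other (within one standard deviation), the local-memory variants show no measurable gain over the naive kernel on this device.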

Multiplication

C:\Users\koufe\GPGPUTasks2024\cmake-build-debug\matrix_multiplication.exe 1
OpenCL devices:
  Device #0: CPU. 13th Gen Intel(R) Core(TM) i7-13700H. Intel(R) Corporation. Total memory: 16003 Mb
  Device #1: GPU. Intel(R) Iris(R) Xe Graphics. Total memory: 6401 Mb
Using device #1: GPU. Intel(R) Iris(R) Xe Graphics. Total memory: 6401 Mb
Data generated for M=1024, K=1024, N=1024
CPU: 5.769+-0 s
CPU: 0.346681 GFlops
[naive, ts=4]
    GPU: 0.0505+-0.00111803 s
    GPU: 39.604 GFlops
    Average difference: 0.000196008%
[naive, ts=8]
    GPU: 0.0278333+-0.000372678 s
    GPU: 71.8563 GFlops
    Average difference: 0.000196008%
[naive, ts=16]
    GPU: 0.021+-0.00057735 s
    GPU: 95.2381 GFlops
    Average difference: 0.000196008%
[local, ts=4]
    GPU: 0.0338333+-0.00146249 s
    GPU: 59.1133 GFlops
    Average difference: 0.000196008%
[local, ts=8]
    GPU: 0.0196667+-0.000471405 s
    GPU: 101.695 GFlops
    Average difference: 0.000196008%
[local, ts=16]
    GPU: 0.0265+-0.000957427 s
    GPU: 75.4717 GFlops
    Average difference: 0.000196008%
[local wpt, ts=4, wpt=2]
    GPU: 0.0691667+-0.00226691 s
    GPU: 28.9157 GFlops
    Average difference: 0.000196008%
[local wpt, ts=4, wpt=4]
    GPU: 0.111833+-0.00681705 s
    GPU: 17.8838 GFlops
    Average difference: 0.000196008%
[local wpt, ts=8, wpt=2]
    GPU: 0.0123333+-0.000471405 s
    GPU: 162.162 GFlops
    Average difference: 0.000196008%
[local wpt, ts=8, wpt=4]
    GPU: 0.0256667+-0.00449691 s
    GPU: 77.9221 GFlops
    Average difference: 0.000196008%
[local wpt, ts=8, wpt=8]
    GPU: 0.0201667+-0.000897527 s
    GPU: 99.1736 GFlops
    Average difference: 0.000196008%
[local wpt, ts=16, wpt=2]
    GPU: 0.0151667+-0.000687184 s
    GPU: 131.868 GFlops
    Average difference: 0.000196008%
[local wpt, ts=16, wpt=4]
    GPU: 0.01+-0.00057735 s
    GPU: 200 GFlops
    Average difference: 0.000196008%
[local wpt, ts=16, wpt=8]
    GPU: 0.00883333+-0.000372678 s
    GPU: 226.415 GFlops
    Average difference: 0.000196008%
[local wpt, ts=16, wpt=16]
    GPU: 0.0123333+-0.000471405 s
    GPU: 162.162 GFlops
    Average difference: 0.000196008%
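
The logged GFlops values are consistent with dividing 2 by the mean time in seconds, for example 2 / 0.0505 ≈ 39.604. That suggests the operation count is taken as 2 GFLOP rather than the exact 2*M*K*N = 2.147... GFLOP. This is an inference from the numbers, not something the log states. A small sketch comparing the two conventions (both function names are hypothetical):

```python
def logged_gflops(seconds: float, total_gflop: float = 2.0) -> float:
    """GFlops as the log seems to report it: a fixed 2 GFLOP over the mean time (assumption)."""
    return total_gflop / seconds


def exact_gflops(m: int, k: int, n: int, seconds: float) -> float:
    """GFlops from the exact operation count 2*m*k*n (one multiply and one add per term)."""
    return 2 * m * k * n / seconds / 1e9


# Best local configuration above: [local wpt, ts=16, wpt=8].
t = 0.00883333
print(f"logged convention: {logged_gflops(t):.3f} GFlops")
print(f"exact op count:    {exact_gflops(1024, 1024, 1024, t):.3f} GFlops")
```

Under the exact count every figure in the table would be about 7% higher. The ranking of configurations does not change.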

GitHub CI output

Transpose

OpenCL devices:
  Device #0: CPU. AMD EPYC 7763 64-Core Processor                . Intel(R) Corporation. Total memory: 15991 Mb
Using device #0: CPU. AMD EPYC 7763 64-Core Processor                . Intel(R) Corporation. Total memory: 15991 Mb
Data generated for M=4096, K=4096
[matrix_transpose_naive]
    GPU: 0.0171868+-0.00135369 s
    GPU: 976.169 millions/s
[matrix_transpose_local_bad_banks]
    GPU: 0.0238586+-0.000391206 s
    GPU: 703.194 millions/s
[matrix_transpose_local_good_banks]
    GPU: 0.0280841+-0.000134503 s
    GPU: 597.393 millions/s

Multiplication

OpenCL devices:
  Device #0: CPU. AMD EPYC 7763 64-Core Processor                . Intel(R) Corporation. Total memory: 15991 Mb
Using device #0: CPU. AMD EPYC 7763 64-Core Processor                . Intel(R) Corporation. Total memory: 15991 Mb
Data generated for M=1024, K=1024, N=1024
CPU: 6.25+-0 s
CPU: 0.32 GFlops
[naive, ts=4]
    GPU: 0.246773+-0.000759867 s
    GPU: 8.10463 GFlops
    Average difference: 0.000149043%
[naive, ts=8]
    GPU: 0.261947+-0.0029847 s
    GPU: 7.63512 GFlops
    Average difference: 0.000149043%
[naive, ts=16]
    GPU: 0.267871+-0.00295813 s
    GPU: 7.46629 GFlops
    Average difference: 0.000149043%
[local, ts=4]
    GPU: 0.555525+-0.0012201 s
    GPU: 3.6002 GFlops
    Average difference: 0.000149043%
[local, ts=8]
    GPU: 0.315502+-0.00441561 s
    GPU: 6.33911 GFlops
    Average difference: 0.000149043%
[local, ts=16]
    GPU: 0.285373+-0.00140375 s
    GPU: 7.00836 GFlops
    Average difference: 0.000149043%
[local wpt, ts=4, wpt=2]
    GPU: 0.5184+-0.00304857 s
    GPU: 3.85802 GFlops
    Average difference: 0.000149043%
[local wpt, ts=4, wpt=4]
    GPU: 0.438042+-0.00168949 s
    GPU: 4.56577 GFlops
    Average difference: 0.000149043%
[local wpt, ts=8, wpt=2]
    GPU: 0.31639+-0.00250521 s
    GPU: 6.32132 GFlops
    Average difference: 0.000149043%
[local wpt, ts=8, wpt=4]
    GPU: 0.283483+-0.00238618 s
    GPU: 7.05509 GFlops
    Average difference: 0.000149043%
[local wpt, ts=8, wpt=8]
    GPU: 0.261017+-0.00169234 s
    GPU: 7.66233 GFlops
    Average difference: 0.000149043%
[local wpt, ts=16, wpt=2]
    GPU: 0.220607+-0.000865029 s
    GPU: 9.0659 GFlops
    Average difference: 0.000149043%
[local wpt, ts=16, wpt=4]
    GPU: 0.194112+-0.000522242 s
    GPU: 10.3033 GFlops
    Average difference: 0.000149043%
[local wpt, ts=16, wpt=8]
    GPU: 0.285621+-0.00175619 s
    GPU: 7.00228 GFlops
    Average difference: 0.000149043%
[local wpt, ts=16, wpt=16]
    GPU: 0.337311+-0.00248971 s
    GPU: 5.92924 GFlops
    Average difference: 0.000149043%
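
To put the two runs side by side, the sketch below computes the speedup of the fastest kernel over the CPU baseline in each environment. The timings are copied from the logs. The dictionary layout and the `speedup` helper are illustrative assumptions, not part of the project.

```python
# Mean times in seconds, copied from the multiplication logs.
runs = {
    # Fastest local entry: [local wpt, ts=16, wpt=8]
    "local (Iris Xe GPU)": {"cpu": 5.769, "best": 0.00883333},
    # Fastest CI entry: [local wpt, ts=16, wpt=4]
    "GitHub CI (EPYC 7763, OpenCL on CPU)": {"cpu": 6.25, "best": 0.194112},
}


def speedup(cpu_seconds: float, kernel_seconds: float) -> float:
    """How many times faster the OpenCL kernel is than the CPU reference."""
    return cpu_seconds / kernel_seconds


for name, t in runs.items():
    print(f"{name}: {speedup(t['cpu'], t['best']):.1f}x")
```

The best work-per-thread setting differs between the two devices (wpt=8 locally, wpt=4 on CI), so the tuning found on one device should not be assumed to carry over to another.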
